Enhancing LTP-Driven Cache Management Using Reuse Distance Information
Authors
Abstract
Traditional caches employ the LRU management policy to drive replacement decisions. However, previous studies have shown that LRU can perform significantly worse than the theoretical optimum, OPT [1]. To better match OPT, it is necessary to aggressively anticipate the future memory references performed in the cache. Recently, several researchers have tried to approximate OPT management by predicting last touch references [2, 3, 4, 5]. Existing last touch predictors (LTPs) either correlate last touch references with execution signatures, such as instruction traces [3, 4] or last touch history [5], or they predict cache block lifetimes based on reference [2] or cycle [6] counts. On a predicted last touch, the referenced cache block is marked for early eviction. This permits cache blocks lower in the LRU stack, but with shorter reuse distances, to remain in cache longer, resulting in additional cache hits. This paper investigates three mechanisms to improve LTP-driven cache management. First, we propose exploiting reuse distance information to increase LTP accuracy. Specifically, we correlate a memory reference's last touch outcome with its global reuse distance history. Second, for LTPs, we also advocate selecting the most-recently-used LRU last touch block for eviction. We find that an MRU victim selection policy evicts fewer LNO last touches [5] and fewer mispredicted LRU last touches. Our results show that for an 8-way 1 MB L2 cache, a 54 KB RD-LTP that combines both mechanisms reduces the cache miss rate by 12.6% and 15.8% compared to LvP and AIP [2], two state-of-the-art last touch predictors, and by 9.3% compared to DIP [7], a recent insertion policy. Finally, we also propose predicting actual reuse distance values using reuse distance predictors (RDPs). An RDP is very similar to an RD-LTP, except that its predictor table stores exact reuse distance values instead of last touch outcomes. Because RDPs predict reuse distances, they can distinguish between LNO and OPT last touches more accurately. Our results show that a 64 KB RDP improves the miss rate by an additional 2.7% compared to an RD-LTP.
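The two RD-LTP mechanisms described in the abstract lend themselves to a compact illustration. The C++ sketch below is a minimal reading of the abstract only, not the paper's actual hardware design: a predictor table indexed by a hash of the referencing instruction and a global history of recent reuse distances, plus MRU-first victim selection among predicted last touch blocks. The table size, history length, hash function, two-bit confidence counters, and all names (RdLtp, chooseVictim, etc.) are assumptions introduced here; an RDP would differ mainly in storing predicted reuse distance values in the table instead of last touch outcomes.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <vector>

constexpr size_t kGlobalHistLen = 4;   // assumed length of the global reuse-distance history
constexpr size_t kTableSize = 4096;    // assumed predictor table size

// Two-bit saturating counter as a last-touch confidence estimator (an assumption;
// the abstract does not describe the table organization).
struct Counter2 {
    uint8_t v = 1;
    void train(bool lastTouch) { v = lastTouch ? (v < 3 ? v + 1 : 3) : (v > 0 ? v - 1 : 0); }
    bool predictLastTouch() const { return v >= 2; }
};

// RD-LTP-style predictor: indexed by a hash of the referencing PC and the global
// history of recent (quantized) reuse distances; stores a last-touch outcome estimate.
class RdLtp {
public:
    bool predict(uint64_t pc) const { return table_[index(pc)].predictLastTouch(); }
    void train(uint64_t pc, bool wasLastTouch) { table_[index(pc)].train(wasLastTouch); }

    // Record the reuse distance observed for the current reference (quantized by the caller).
    void pushReuseDistance(int quantizedRd) {
        history_.push_back(quantizedRd);
        if (history_.size() > kGlobalHistLen) history_.erase(history_.begin());
    }

private:
    size_t index(uint64_t pc) const {
        uint64_t h = pc;
        for (int d : history_) h = h * 31 + static_cast<uint64_t>(d);
        return static_cast<size_t>(h % kTableSize);
    }
    std::array<Counter2, kTableSize> table_{};
    std::vector<int> history_;
};

// One cache set, ordered MRU (front) to LRU (back); each block carries the
// predictor's last-touch bit from its most recent access.
struct Block { uint64_t tag; bool predictedLastTouch; };

// MRU-first victim selection among predicted last touch blocks. Scanning from the
// MRU end picks the most recently used predicted last touch, which the abstract
// reports evicts fewer LNO and mispredicted last touches than choosing from the LRU end.
int chooseVictim(const std::vector<Block>& mruToLru) {
    for (size_t i = 0; i < mruToLru.size(); ++i)
        if (mruToLru[i].predictedLastTouch) return static_cast<int>(i);
    return static_cast<int>(mruToLru.size()) - 1;  // no prediction: fall back to plain LRU
}

int main() {
    RdLtp ltp;
    ltp.pushReuseDistance(2);                    // observed (quantized) reuse distance
    ltp.train(0x400123, /*wasLastTouch=*/true);  // train on a resolved last-touch outcome
    std::vector<Block> set = {{0xA, false}, {0xB, true}, {0xC, false}};  // MRU -> LRU
    std::cout << "predicted last touch: " << ltp.predict(0x400123)
              << ", victim way (MRU-first): " << chooseVictim(set) << "\n";
}
```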
Similar resources
Using Locality and Interleaving Information to Improve Shared Cache Performance
Title of dissertation: Using Locality and Interleaving Information to Improve Shared Cache Performance. Wanli Liu, Doctor of Philosophy, 2009. Dissertation directed by: Professor Donald Yeung, Department of Electrical and Computer Engineering. Cache interference is found to play a critical role in optimizing cache allocation among concurrent threads sharing a cache. The conventional LRU policy usu...
Understanding Unfulfilled Memory Reuse Potential in Scientific Applications
The potential for improving the performance of data-intensive scientific programs by enhancing data reuse in cache is substantial because CPUs are significantly faster than memory. Traditional performance tools typically collect or simulate cache miss counts or rates and attribute them at the function level. While such information identifies program scopes that suffer from poor data locality, i...
On the Theory and Potential of Collaborative Cache Management
The goal of cache management is to maximize data reuse. Collaborative caching provides an interface for software to communicate access information to hardware. In theory, it can obtain optimal cache performance. In this paper, we study a collaborative caching system that allows a program to choose different caching methods for its data. As an interface, it may be used in arbitrary ways, sometim...
Reuse-Aware Management for Last-Level Caches
Variability in generational behavior of cache blocks is a key challenge for cache management policies that aim to identify dead blocks as early and as accurately as possible to maximize cache efficiency. Existing management policies are limited by the metrics they use to identify dead blocks, leading to low coverage and/or low accuracy in the face of variability. In response, we introduce a new...
Studying the Impact of Multicore Processor Scaling on Cache Coherence Directories via Reuse Distance Analysis
Title of dissertation: Studying the Impact of Multicore Processor Scaling on Cache Coherence Directories via Reuse Distance Analysis. Minshu Zhao, Doctor of Philosophy, 2015. Dissertation directed by: Professor Donald Yeung, Department of Electrical and Computer Engineering. Directories are one key part of a processor's cache coherence hardware, and constitute one of the main bottlenecks in multico...
Journal:
J. Instruction-Level Parallelism
Volume 11, Issue: -
Pages: -
Publication date: 2009